Overview

Dataset statistics

Number of variables: 15
Number of observations: 44837
Missing cells: 40622
Missing cells (%): 6.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.1 MiB
Average record size in memory: 120.0 B
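
The missing-cell percentage can be cross-checked from the counts above. This is a minimal sketch that assumes the percentage is taken over all cells (variables × observations); the variable names are illustrative only.

```python
# Figures copied from the dataset statistics above.
n_variables = 15
n_observations = 44837
missing_cells = 40622

# Assumption: the percentage is computed over every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result rounds to the 6.0% shown in the report.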

Variable types

Numeric: 9
Text: 6

Alerts

popularity is highly overall correlated with vote_count (High correlation)
vote_count is highly overall correlated with popularity (High correlation)
id_collection has 40375 (90.0%) missing values (Missing)
popularity is highly skewed (γ1 = 29.46841147) (Skewed)
id has unique values (Unique)
budget has 35999 (80.3%) zeros (Zeros)
runtime has 1506 (3.4%) zeros (Zeros)
vote_average has 2880 (6.4%) zeros (Zeros)
vote_count has 2784 (6.2%) zeros (Zeros)
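
The zero alerts above are often a sign that 0 is standing in for "unknown". Whether that holds for budget or runtime here is an assumption, not something the report states. The sketch below uses hypothetical helper names and a made-up sample list to show one way to measure the zero share and turn zeros into explicit missing values.

```python
def zero_share(values):
    """Percentage of entries that are exactly 0."""
    return 100 * sum(1 for v in values if v == 0) / len(values)


def zeros_to_missing(values):
    """Return a copy of `values` with every 0 replaced by None."""
    return [None if v == 0 else v for v in values]


# Hypothetical sample, not taken from the dataset.
budgets = [30000000, 0, 0, 16000000, 0]

share = zero_share(budgets)
cleaned = zeros_to_missing(budgets)

# Cross-check of the reported alert: 35999 zeros out of 44837 rows.
reported_share = round(100 * 35999 / 44837, 1)
```

If zeros are legitimate values for a column (for example vote_count), this conversion should not be applied to it.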

Reproduction

Analysis started: 2023-06-10 22:22:31.064524
Analysis finished: 2023-06-10 22:23:01.073202
Duration: 30.01 seconds
Software version: ydata-profiling v4.2.0
Download configuration: config.json

Variables

id
Real number (ℝ)

Distinct: 44837
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 106925.11
Minimum: 2
Maximum: 469172
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 2
5-th percentile: 5205.8
Q1: 26203
Median: 59115
Q3: 153854
95-th percentile: 354129
Maximum: 469172
Range: 469170
Interquartile range (IQR): 127651

Descriptive statistics

Standard deviation: 111095.38
Coefficient of variation (CV): 1.0390017
Kurtosis: 0.57769623
Mean: 106925.11
Median Absolute Deviation (MAD): 43834
Skewness: 1.2884833
Sum: 4.7942013 × 10^9
Variance: 1.2342183 × 10^10
Monotonicity: Not monotonic
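
Several of the statistics for id are derived from others in the same tables, which gives a quick consistency check. The sketch below recomputes them from the rounded values printed in the report, so small differences in the last digit are expected.

```python
# Values copied from the `id` statistics in the report.
q1, q3 = 26203, 153854
minimum, maximum = 2, 469172
mean, std = 106925.11, 111095.38

iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range
cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation

print(iqr, value_range, round(cv, 5), f"{variance:.4e}")
```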
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
862 1
 
< 0.1%
182424 1
 
< 0.1%
87321 1
 
< 0.1%
291613 1
 
< 0.1%
103675 1
 
< 0.1%
171424 1
 
< 0.1%
337210 1
 
< 0.1%
113040 1
 
< 0.1%
244418 1
 
< 0.1%
114953 1
 
< 0.1%
Other values (44827) 44827
> 99.9%
ValueCountFrequency (%)
2 1
< 0.1%
3 1
< 0.1%
5 1
< 0.1%
6 1
< 0.1%
11 1
< 0.1%
12 1
< 0.1%
13 1
< 0.1%
14 1
< 0.1%
15 1
< 0.1%
16 1
< 0.1%
ValueCountFrequency (%)
469172 1
< 0.1%
468343 1
< 0.1%
462788 1
< 0.1%
461955 1
< 0.1%
461805 1
< 0.1%
461615 1
< 0.1%
461533 1
< 0.1%
461089 1
< 0.1%
460870 1
< 0.1%
460822 1
< 0.1%

title
Text

Distinct: 41740
Distinct (%): 93.1%
Missing: 0
Missing (%): 0.0%
Memory size: 350.4 KiB

Length

Max length: 105
Median length: 79
Mean length: 16.693557
Min length: 1

Characters and Unicode

Total characters: 748489
Distinct characters: 285
Distinct categories: 17
Distinct scripts: 7
Distinct blocks: 12
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
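
As an illustration of the character properties mentioned above, Python's standard `unicodedata` module exposes the general category of each code point. This is a minimal sketch; the helper name `category_counts` is hypothetical, and the report's own implementation may differ. The two-letter codes map to the report's labels, for example `Lu` is Uppercase Letter, `Ll` is Lowercase Letter and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the characters of `text` by Unicode general category code."""
    return Counter(unicodedata.category(ch) for ch in text)


counts = category_counts("Toy Story")
half_category = unicodedata.category("½")  # "No" = Other Number

print(counts)
```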

Unique

Unique: 39470
Unique (%): 88.0%

Sample

1st row: Toy Story
2nd row: Jumanji
3rd row: Grumpier Old Men
4th row: Waiting to Exhale
5th row: Father of the Bride Part II
ValueCountFrequency (%)
the 14413
 
10.7%
of 4870
 
3.6%
a 2203
 
1.6%
in 1676
 
1.2%
and 1607
 
1.2%
to 1039
 
0.8%
749
 
0.6%
man 661
 
0.5%
love 655
 
0.5%
for 594
 
0.4%
Other values (24114) 106056
78.8%
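
The word table above appears to count lowercased, whitespace-separated tokens across all titles. That tokenisation is an assumption; the report does not document it. Under that assumption, a comparable count can be sketched as follows, using the five sample titles and a hypothetical helper `word_counts`.

```python
from collections import Counter


def word_counts(titles):
    """Count lowercased, whitespace-separated words across all titles."""
    return Counter(word for title in titles for word in title.lower().split())


titles = [
    "Toy Story",
    "Jumanji",
    "Grumpier Old Men",
    "Waiting to Exhale",
    "Father of the Bride Part II",
]
counts = word_counts(titles)

print(counts.most_common(3))
```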

Most occurring characters

ValueCountFrequency (%)
89708
 
12.0%
e 75366
 
10.1%
a 48337
 
6.5%
o 45113
 
6.0%
n 40322
 
5.4%
r 39529
 
5.3%
i 39209
 
5.2%
t 36257
 
4.8%
s 29154
 
3.9%
h 28210
 
3.8%
Other values (275) 277284
37.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 527495
70.5%
Uppercase Letter 115863
 
15.5%
Space Separator 89708
 
12.0%
Other Punctuation 10349
 
1.4%
Decimal Number 3796
 
0.5%
Dash Punctuation 967
 
0.1%
Close Punctuation 86
 
< 0.1%
Open Punctuation 84
 
< 0.1%
Final Punctuation 38
 
< 0.1%
Other Letter 25
 
< 0.1%
Other values (7) 78
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 75366
14.3%
a 48337
9.2%
o 45113
 
8.6%
n 40322
 
7.6%
r 39529
 
7.5%
i 39209
 
7.4%
t 36257
 
6.9%
s 29154
 
5.5%
h 28210
 
5.3%
l 25609
 
4.9%
Other values (121) 120389
22.8%
Uppercase Letter
ValueCountFrequency (%)
T 15833
13.7%
S 10214
 
8.8%
M 7938
 
6.9%
B 7567
 
6.5%
C 7084
 
6.1%
A 6693
 
5.8%
D 6249
 
5.4%
L 5806
 
5.0%
H 5106
 
4.4%
W 5096
 
4.4%
Other values (63) 38277
33.0%
Other Letter
ValueCountFrequency (%)
ه 2
 
8.0%
ی 2
 
8.0%
چ 2
 
8.0%
ک 2
 
8.0%
1
 
4.0%
ج 1
 
4.0%
1
 
4.0%
1
 
4.0%
1
 
4.0%
ا 1
 
4.0%
Other values (11) 11
44.0%
Other Punctuation
ValueCountFrequency (%)
: 3667
35.4%
' 2471
23.9%
. 1585
15.3%
, 1112
 
10.7%
! 642
 
6.2%
& 454
 
4.4%
? 265
 
2.6%
/ 77
 
0.7%
* 19
 
0.2%
# 13
 
0.1%
Other values (8) 44
 
0.4%
Decimal Number
ValueCountFrequency (%)
2 851
22.4%
1 691
18.2%
0 608
16.0%
3 473
12.5%
9 228
 
6.0%
4 225
 
5.9%
5 219
 
5.8%
7 190
 
5.0%
8 157
 
4.1%
6 154
 
4.1%
Math Symbol
ValueCountFrequency (%)
+ 17
73.9%
× 2
 
8.7%
= 1
 
4.3%
1
 
4.3%
1
 
4.3%
1
 
4.3%
Other Number
ValueCountFrequency (%)
½ 12
63.2%
² 3
 
15.8%
³ 2
 
10.5%
1
 
5.3%
1
 
5.3%
Other Symbol
ValueCountFrequency (%)
° 3
37.5%
2
25.0%
1
 
12.5%
1
 
12.5%
1
 
12.5%
Currency Symbol
ValueCountFrequency (%)
$ 18
85.7%
¢ 2
 
9.5%
£ 1
 
4.8%
Dash Punctuation
ValueCountFrequency (%)
- 952
98.4%
15
 
1.6%
Close Punctuation
ValueCountFrequency (%)
) 81
94.2%
] 5
 
5.8%
Open Punctuation
ValueCountFrequency (%)
( 79
94.0%
[ 5
 
6.0%
Final Punctuation
ValueCountFrequency (%)
37
97.4%
1
 
2.6%
Initial Punctuation
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
89708
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 3
100.0%
Format
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 642851
85.9%
Common 105106
 
14.0%
Cyrillic 338
 
< 0.1%
Greek 170
 
< 0.1%
Arabic 11
 
< 0.1%
Katakana 8
 
< 0.1%
Han 5
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 75366
 
11.7%
a 48337
 
7.5%
o 45113
 
7.0%
n 40322
 
6.3%
r 39529
 
6.1%
i 39209
 
6.1%
t 36257
 
5.6%
s 29154
 
4.5%
h 28210
 
4.4%
l 25609
 
4.0%
Other values (106) 235745
36.7%
Common
ValueCountFrequency (%)
89708
85.4%
: 3667
 
3.5%
' 2471
 
2.4%
. 1585
 
1.5%
, 1112
 
1.1%
- 952
 
0.9%
2 851
 
0.8%
1 691
 
0.7%
! 642
 
0.6%
0 608
 
0.6%
Other values (50) 2819
 
2.7%
Cyrillic
ValueCountFrequency (%)
о 32
 
9.5%
е 32
 
9.5%
а 27
 
8.0%
н 23
 
6.8%
и 22
 
6.5%
р 22
 
6.5%
к 17
 
5.0%
с 15
 
4.4%
т 14
 
4.1%
в 13
 
3.8%
Other values (37) 121
35.8%
Greek
ValueCountFrequency (%)
α 20
 
11.8%
ο 14
 
8.2%
ι 14
 
8.2%
τ 9
 
5.3%
ά 8
 
4.7%
λ 8
 
4.7%
ρ 8
 
4.7%
ν 7
 
4.1%
ε 6
 
3.5%
η 6
 
3.5%
Other values (32) 70
41.2%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Arabic
ValueCountFrequency (%)
ه 2
18.2%
ی 2
18.2%
چ 2
18.2%
ک 2
18.2%
ج 1
9.1%
ا 1
9.1%
س 1
9.1%
Han
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 746957
99.8%
None 1099
 
0.1%
Cyrillic 338
 
< 0.1%
Punctuation 62
 
< 0.1%
Arabic 11
 
< 0.1%
Katakana 8
 
< 0.1%
CJK 5
 
< 0.1%
Misc Symbols 3
 
< 0.1%
Letterlike Symbols 2
 
< 0.1%
Math Operators 2
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
89708
 
12.0%
e 75366
 
10.1%
a 48337
 
6.5%
o 45113
 
6.0%
n 40322
 
5.4%
r 39529
 
5.3%
i 39209
 
5.2%
t 36257
 
4.9%
s 29154
 
3.9%
h 28210
 
3.8%
Other values (76) 275752
36.9%
None
ValueCountFrequency (%)
é 212
19.3%
ä 126
 
11.5%
ö 55
 
5.0%
è 52
 
4.7%
ô 44
 
4.0%
ü 38
 
3.5%
ó 36
 
3.3%
ı 35
 
3.2%
á 34
 
3.1%
í 32
 
2.9%
Other values (107) 435
39.6%
Punctuation
ValueCountFrequency (%)
37
59.7%
15
24.2%
5
 
8.1%
2
 
3.2%
1
 
1.6%
1
 
1.6%
1
 
1.6%
Cyrillic
ValueCountFrequency (%)
о 32
 
9.5%
е 32
 
9.5%
а 27
 
8.0%
н 23
 
6.8%
и 22
 
6.5%
р 22
 
6.5%
к 17
 
5.0%
с 15
 
4.4%
т 14
 
4.1%
в 13
 
3.8%
Other values (37) 121
35.8%
Arabic
ValueCountFrequency (%)
ه 2
18.2%
ی 2
18.2%
چ 2
18.2%
ک 2
18.2%
ج 1
9.1%
ا 1
9.1%
س 1
9.1%
Misc Symbols
ValueCountFrequency (%)
2
66.7%
1
33.3%
Letterlike Symbols
ValueCountFrequency (%)
1
50.0%
1
50.0%
CJK
ValueCountFrequency (%)
1
20.0%
1
20.0%
1
20.0%
1
20.0%
1
20.0%
Number Forms
ValueCountFrequency (%)
1
100.0%
Katakana
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
Math Operators
ValueCountFrequency (%)
1
50.0%
1
50.0%
Arrows
ValueCountFrequency (%)
1
100.0%

budget
Real number (ℝ)

Distinct: 1215
Distinct (%): 2.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4273734.5
Minimum: 0
Maximum: 3.8 × 10^8
Zeros: 35999
Zeros (%): 80.3%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 25000000
Maximum: 3.8 × 10^8
Range: 3.8 × 10^8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 17532226
Coefficient of variation (CV): 4.1023198
Kurtosis: 65.92765
Mean: 4273734.5
Median Absolute Deviation (MAD): 0
Skewness: 7.0821562
Sum: 1.9162143 × 10^11
Variance: 3.0737894 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 35999
80.3%
5000000 285
 
0.6%
10000000 258
 
0.6%
20000000 242
 
0.5%
2000000 237
 
0.5%
15000000 225
 
0.5%
3000000 223
 
0.5%
25000000 206
 
0.5%
1000000 197
 
0.4%
30000000 189
 
0.4%
Other values (1205) 6776
 
15.1%
ValueCountFrequency (%)
0 35999
80.3%
1 25
 
0.1%
2 13
 
< 0.1%
3 9
 
< 0.1%
4 7
 
< 0.1%
5 7
 
< 0.1%
6 5
 
< 0.1%
7 4
 
< 0.1%
8 5
 
< 0.1%
9 1
 
< 0.1%
ValueCountFrequency (%)
380000000 1
 
< 0.1%
300000000 1
 
< 0.1%
280000000 1
 
< 0.1%
270000000 1
 
< 0.1%
260000000 3
 
< 0.1%
258000000 1
 
< 0.1%
255000000 1
 
< 0.1%
250000000 10
< 0.1%
245000000 2
 
< 0.1%
237000000 1
 
< 0.1%
Distinct: 89
Distinct (%): 0.2%
Missing: 11
Missing (%): < 0.1%
Memory size: 350.4 KiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 89652
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 17
Unique (%): < 0.1%

Sample

1st row: en
2nd row: en
3rd row: en
4th row: en
5th row: en
ValueCountFrequency (%)
en 31894
71.2%
fr 2385
 
5.3%
it 1515
 
3.4%
ja 1340
 
3.0%
de 1063
 
2.4%
es 976
 
2.2%
ru 782
 
1.7%
hi 503
 
1.1%
ko 442
 
1.0%
zh 403
 
0.9%
Other values (79) 3523
 
7.9%

Most occurring characters

ValueCountFrequency (%)
e 34187
38.1%
n 32592
36.4%
r 3538
 
3.9%
f 2770
 
3.1%
i 2361
 
2.6%
t 2228
 
2.5%
a 1818
 
2.0%
s 1626
 
1.8%
j 1341
 
1.5%
d 1305
 
1.5%
Other values (16) 5886
 
6.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 89652
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 34187
38.1%
n 32592
36.4%
r 3538
 
3.9%
f 2770
 
3.1%
i 2361
 
2.6%
t 2228
 
2.5%
a 1818
 
2.0%
s 1626
 
1.8%
j 1341
 
1.5%
d 1305
 
1.5%
Other values (16) 5886
 
6.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 89652
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 34187
38.1%
n 32592
36.4%
r 3538
 
3.9%
f 2770
 
3.1%
i 2361
 
2.6%
t 2228
 
2.5%
a 1818
 
2.0%
s 1626
 
1.8%
j 1341
 
1.5%
d 1305
 
1.5%
Other values (16) 5886
 
6.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 89652
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 34187
38.1%
n 32592
36.4%
r 3538
 
3.9%
f 2770
 
3.1%
i 2361
 
2.6%
t 2228
 
2.5%
a 1818
 
2.0%
s 1626
 
1.8%
j 1341
 
1.5%
d 1305
 
1.5%
Other values (16) 5886
 
6.6%

popularity
Real number (ℝ)

HIGH CORRELATION  SKEWED 

Distinct: 43251
Distinct (%): 96.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.9368967
Minimum: 0
Maximum: 547.4883
Zeros: 39
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.0214034
Q1: 0.393975
Median: 1.138972
Q3: 3.732537
95-th percentile: 11.073024
Maximum: 547.4883
Range: 547.4883
Interquartile range (IQR): 3.338562

Descriptive statistics

Standard deviation: 6.0118511
Coefficient of variation (CV): 2.047008
Kurtosis: 1943.3744
Mean: 2.9368967
Median Absolute Deviation (MAD): 0.975422
Skewness: 29.468411
Sum: 131681.64
Variance: 36.142354
Monotonicity: Not monotonic
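
For readers unfamiliar with the skewness figure above, the sketch below computes the moment-based coefficient g1 = m3 / m2^1.5 in plain Python. This is an illustration of the concept only: the profiler may use a bias-corrected estimator, so the sketch is not guaranteed to reproduce 29.468411 exactly, and the function name is hypothetical.

```python
def skewness(values):
    """Moment coefficient of skewness: third central moment over m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


symmetric = skewness([1, 2, 3, 4, 5])       # symmetric data, no skew
right_tailed = skewness([0, 0, 0, 0, 10])   # one large value pulls the tail right

print(symmetric, right_tailed)
```

A large positive value, as reported for popularity, indicates a long right tail: most values are small and a few are very large.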
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 × 10^-6 | 54 | 0.1%
0.000308 | 42 | 0.1%
0 | 39 | 0.1%
0.00022 | 39 | 0.1%
0.001177 | 38 | 0.1%
0.000578 | 38 | 0.1%
0.000844 | 38 | 0.1%
0.002001 | 27 | 0.1%
0.003013 | 21 | < 0.1%
0.00353 | 19 | < 0.1%
Other values (43241) | 44482 | 99.2%

Value | Count | Frequency (%)
0 | 39 | 0.1%
1 × 10^-6 | 54 | 0.1%
2 × 10^-6 | 5 | < 0.1%
3 × 10^-6 | 5 | < 0.1%
4 × 10^-6 | 5 | < 0.1%
5 × 10^-6 | 1 | < 0.1%
6 × 10^-6 | 2 | < 0.1%
7 × 10^-6 | 1 | < 0.1%
8 × 10^-6 | 6 | < 0.1%
9 × 10^-6 | 2 | < 0.1%

Value | Count | Frequency (%)
547.488298 | 1 | < 0.1%
294.337037 | 1 | < 0.1%
287.253654 | 1 | < 0.1%
228.032744 | 1 | < 0.1%
213.849907 | 1 | < 0.1%
187.860492 | 1 | < 0.1%
185.330992 | 1 | < 0.1%
185.070892 | 1 | < 0.1%
183.870374 | 1 | < 0.1%
154.801009 | 1 | < 0.1%

runtime
Real number (ℝ)

Distinct: 353
Distinct (%): 0.8%
Missing: 236
Missing (%): 0.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 94.349432
Minimum: 0
Maximum: 1256
Zeros: 1506
Zeros (%): 3.4%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 13
Q1: 85
Median: 95
Q3: 107
95-th percentile: 138
Maximum: 1256
Range: 1256
Interquartile range (IQR): 22

Descriptive statistics

Standard deviation: 38.238363
Coefficient of variation (CV): 0.40528451
Kurtosis: 95.911275
Mean: 94.349432
Median Absolute Deviation (MAD): 11
Skewness: 4.5804385
Sum: 4208079
Variance: 1462.1724
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
90 2521
 
5.6%
0 1506
 
3.4%
100 1460
 
3.3%
95 1391
 
3.1%
93 1197
 
2.7%
96 1096
 
2.4%
92 1065
 
2.4%
94 1055
 
2.4%
91 1045
 
2.3%
97 1015
 
2.3%
Other values (343) 31250
69.7%
ValueCountFrequency (%)
0 1506
3.4%
1 100
 
0.2%
2 29
 
0.1%
3 39
 
0.1%
4 43
 
0.1%
5 49
 
0.1%
6 70
 
0.2%
7 98
 
0.2%
8 72
 
0.2%
9 59
 
0.1%
ValueCountFrequency (%)
1256 1
< 0.1%
1140 2
< 0.1%
931 1
< 0.1%
925 1
< 0.1%
900 1
< 0.1%
877 1
< 0.1%
874 1
< 0.1%
840 2
< 0.1%
780 1
< 0.1%
720 1
< 0.1%

vote_average
Real number (ℝ)

Distinct: 92
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.6280884
Minimum: 0
Maximum: 10
Zeros: 2880
Zeros (%): 6.4%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 5
Median: 6
Q3: 6.8
95-th percentile: 7.8
Maximum: 10
Range: 10
Interquartile range (IQR): 1.8

Descriptive statistics

Standard deviation: 1.9086658
Coefficient of variation (CV): 0.33913217
Kurtosis: 2.5747989
Mean: 5.6280884
Median Absolute Deviation (MAD): 0.9
Skewness: -1.5297195
Sum: 252346.6
Variance: 3.6430053
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 2880
 
6.4%
6 2422
 
5.4%
5 1962
 
4.4%
7 1855
 
4.1%
6.5 1705
 
3.8%
6.3 1589
 
3.5%
5.5 1366
 
3.0%
5.8 1352
 
3.0%
6.4 1342
 
3.0%
6.7 1328
 
3.0%
Other values (82) 27036
60.3%
ValueCountFrequency (%)
0 2880
6.4%
0.5 13
 
< 0.1%
0.7 1
 
< 0.1%
1 100
 
0.2%
1.1 1
 
< 0.1%
1.2 4
 
< 0.1%
1.3 12
 
< 0.1%
1.4 5
 
< 0.1%
1.5 30
 
0.1%
1.6 6
 
< 0.1%
ValueCountFrequency (%)
10 179
0.4%
9.8 1
 
< 0.1%
9.6 1
 
< 0.1%
9.5 18
 
< 0.1%
9.4 3
 
< 0.1%
9.3 18
 
< 0.1%
9.2 4
 
< 0.1%
9.1 2
 
< 0.1%
9 154
0.3%
8.9 7
 
< 0.1%

vote_count
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 1820
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 111.20153
Minimum: 0
Maximum: 14075
Zeros: 2784
Zeros (%): 6.2%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 3
Median: 10
Q3: 35
95-th percentile: 441
Maximum: 14075
Range: 14075
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 494.55158
Coefficient of variation (CV): 4.4473451
Kurtosis: 149.17758
Mean: 111.20153
Median Absolute Deviation (MAD): 8
Skewness: 10.380647
Sum: 4985943
Variance: 244581.27
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
1 3183
 
7.1%
2 3071
 
6.8%
0 2784
 
6.2%
3 2728
 
6.1%
4 2431
 
5.4%
5 2072
 
4.6%
6 1720
 
3.8%
7 1541
 
3.4%
8 1350
 
3.0%
9 1182
 
2.6%
Other values (1810) 22775
50.8%
ValueCountFrequency (%)
0 2784
6.2%
1 3183
7.1%
2 3071
6.8%
3 2728
6.1%
4 2431
5.4%
5 2072
4.6%
6 1720
3.8%
7 1541
3.4%
8 1350
3.0%
9 1182
 
2.6%
ValueCountFrequency (%)
14075 1
< 0.1%
12269 1
< 0.1%
12114 1
< 0.1%
12000 1
< 0.1%
11444 1
< 0.1%
11187 1
< 0.1%
10297 1
< 0.1%
10014 1
< 0.1%
9678 1
< 0.1%
9634 1
< 0.1%

id_collection
Real number (ℝ)

Distinct: 1690
Distinct (%): 37.9%
Missing: 40375
Missing (%): 90.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 183639.64
Minimum: 10
Maximum: 480160
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 10
5-th percentile: 2704
Q1: 85960
Median: 141286
Q3: 293196
95-th percentile: 439014.45
Maximum: 480160
Range: 480150
Interquartile range (IQR): 207236

Descriptive statistics

Standard deviation: 141536.13
Coefficient of variation (CV): 0.77072756
Kurtosis: -0.92324343
Mean: 183639.64
Median Absolute Deviation (MAD): 104025
Skewness: 0.53627669
Sum: 8.1940008 × 10^8
Variance: 2.0032477 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
415931 29
 
0.1%
645 26
 
0.1%
96887 26
 
0.1%
421566 26
 
0.1%
34055 25
 
0.1%
37261 22
 
< 0.1%
413661 21
 
< 0.1%
374509 16
 
< 0.1%
425164 15
 
< 0.1%
148324 15
 
< 0.1%
Other values (1680) 4241
 
9.5%
(Missing) 40375
90.0%
ValueCountFrequency (%)
10 8
< 0.1%
84 4
< 0.1%
119 3
 
< 0.1%
131 3
 
< 0.1%
151 6
< 0.1%
230 3
 
< 0.1%
263 3
 
< 0.1%
264 3
 
< 0.1%
295 5
< 0.1%
304 3
 
< 0.1%
ValueCountFrequency (%)
480160 1
 
< 0.1%
480071 1
 
< 0.1%
479971 1
 
< 0.1%
479888 2
 
< 0.1%
479692 2
 
< 0.1%
479549 1
 
< 0.1%
479319 13
< 0.1%
478947 2
 
< 0.1%
478628 12
< 0.1%
478545 1
 
< 0.1%
Distinct: 4035
Distinct (%): 9.0%
Missing: 0
Missing (%): 0.0%
Memory size: 350.4 KiB

Length

Max length: 98
Median length: 84
Mean length: 21.655619
Min length: 2

Characters and Unicode

Total characters: 970973
Distinct characters: 43
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2343
Unique (%): 5.2%

Sample

1st row: ['Animation', 'Comedy', 'Family']
2nd row: ['Adventure', 'Fantasy', 'Family']
3rd row: ['Romance', 'Comedy']
4th row: ['Comedy', 'Drama', 'Romance']
5th row: ['Comedy']
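
The sample rows above look like Python list literals stored as text, which is why the profiler treats the column as Text and counts brackets and quotes as characters. Assuming every row follows that format, the strings can be turned back into lists with the standard library's `ast.literal_eval`. This is a minimal sketch using the five sample rows.

```python
import ast
from collections import Counter

# The five sample rows from the report, as raw strings.
rows = [
    "['Animation', 'Comedy', 'Family']",
    "['Adventure', 'Fantasy', 'Family']",
    "['Romance', 'Comedy']",
    "['Comedy', 'Drama', 'Romance']",
    "['Comedy']",
]

# literal_eval only evaluates Python literals, so it is safer than eval().
parsed = [ast.literal_eval(row) for row in rows]

# Count individual genres rather than whole list strings.
genre_counts = Counter(genre for genres in parsed for genre in genres)

print(genre_counts.most_common(2))
```

Profiling the parsed values instead of the raw strings would give per-genre frequencies rather than per-character statistics.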
ValueCountFrequency (%)
drama 20074
20.9%
comedy 12986
13.5%
thriller 7564
 
7.9%
romance 6663
 
6.9%
action 6529
 
6.8%
horror 4624
 
4.8%
crime 4274
 
4.4%
documentary 3868
 
4.0%
adventure 3477
 
3.6%
science 3016
 
3.1%
Other values (37) 23182
24.1%

Most occurring characters

ValueCountFrequency (%)
' 180280
18.6%
r 68492
 
7.1%
a 61215
 
6.3%
e 55235
 
5.7%
m 52509
 
5.4%
51420
 
5.3%
o 47965
 
4.9%
, 47635
 
4.9%
[ 44837
 
4.6%
] 44837
 
4.6%
Other values (33) 316548
32.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 507278
52.2%
Other Punctuation 227915
23.5%
Uppercase Letter 94686
 
9.8%
Space Separator 51420
 
5.3%
Open Punctuation 44837
 
4.6%
Close Punctuation 44837
 
4.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 68492
13.5%
a 61215
12.1%
e 55235
10.9%
m 52509
10.4%
o 47965
9.5%
i 39295
7.7%
n 35304
7.0%
y 28158
5.6%
c 27704
5.5%
t 25957
 
5.1%
Other values (12) 65444
12.9%
Uppercase Letter
ValueCountFrequency (%)
D 23942
25.3%
C 17263
18.2%
A 11909
12.6%
F 9640
10.2%
T 8322
 
8.8%
R 6665
 
7.0%
H 6009
 
6.3%
M 4793
 
5.1%
S 3020
 
3.2%
W 2355
 
2.5%
Other values (6) 768
 
0.8%
Other Punctuation
ValueCountFrequency (%)
' 180280
79.1%
, 47635
 
20.9%
Space Separator
ValueCountFrequency (%)
51420
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 44837
100.0%
Close Punctuation
ValueCountFrequency (%)
] 44837
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 601964
62.0%
Common 369009
38.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 68492
11.4%
a 61215
 
10.2%
e 55235
 
9.2%
m 52509
 
8.7%
o 47965
 
8.0%
i 39295
 
6.5%
n 35304
 
5.9%
y 28158
 
4.7%
c 27704
 
4.6%
t 25957
 
4.3%
Other values (28) 160130
26.6%
Common
ValueCountFrequency (%)
' 180280
48.9%
51420
 
13.9%
, 47635
 
12.9%
[ 44837
 
12.2%
] 44837
 
12.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 970973
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 180280
18.6%
r 68492
 
7.1%
a 61215
 
6.3%
e 55235
 
5.7%
m 52509
 
5.4%
51420
 
5.3%
o 47965
 
4.9%
, 47635
 
4.9%
[ 44837
 
4.6%
] 44837
 
4.6%
Other values (33) 316548
32.6%
Distinct: 22477
Distinct (%): 50.1%
Missing: 0
Missing (%): 0.0%
Memory size: 350.4 KiB

Length

Max length: 173
Median length: 161
Mean length: 10.078663
Min length: 2

Characters and Unicode

Total characters: 451897
Distinct characters: 14
Distinct categories: 5
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 20148
Unique (%): 44.9%

Sample

1st row: [3]
2nd row: [559, 2550, 10201]
3rd row: [6194, 19464]
4th row: [306]
5th row: [5842, 9195]
ValueCountFrequency (%)
11604
 
14.2%
6194 1246
 
1.5%
8411 1074
 
1.3%
4 1000
 
1.2%
306 834
 
1.0%
33 820
 
1.0%
441 447
 
0.5%
5358 432
 
0.5%
5 427
 
0.5%
6 290
 
0.4%
Other values (23501) 63375
77.7%

Most occurring characters

ValueCountFrequency (%)
[ 44837
9.9%
] 44837
9.9%
1 44042
9.7%
, 36712
 
8.1%
36712
 
8.1%
2 32237
 
7.1%
3 31035
 
6.9%
4 29931
 
6.6%
6 27657
 
6.1%
5 27368
 
6.1%
Other values (4) 96529
21.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 288799
63.9%
Open Punctuation 44837
 
9.9%
Close Punctuation 44837
 
9.9%
Other Punctuation 36712
 
8.1%
Space Separator 36712
 
8.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 44042
15.3%
2 32237
11.2%
3 31035
10.7%
4 29931
10.4%
6 27657
9.6%
5 27368
9.5%
8 25441
8.8%
7 24143
8.4%
9 23909
8.3%
0 23036
8.0%
Open Punctuation
ValueCountFrequency (%)
[ 44837
100.0%
Close Punctuation
ValueCountFrequency (%)
] 44837
100.0%
Other Punctuation
ValueCountFrequency (%)
, 36712
100.0%
Space Separator
ValueCountFrequency (%)
36712
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 451897
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
[ 44837
9.9%
] 44837
9.9%
1 44042
9.7%
, 36712
 
8.1%
36712
 
8.1%
2 32237
 
7.1%
3 31035
 
6.9%
4 29931
 
6.6%
6 27657
 
6.1%
5 27368
 
6.1%
Other values (4) 96529
21.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 451897
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
[ 44837
9.9%
] 44837
9.9%
1 44042
9.7%
, 36712
 
8.1%
36712
 
8.1%
2 32237
 
7.1%
3 31035
 
6.9%
4 29931
 
6.6%
6 27657
 
6.1%
5 27368
 
6.1%
Other values (4) 96529
21.4%
Distinct: 42118
Distinct (%): 93.9%
Missing: 0
Missing (%): 0.0%
Memory size: 350.4 KiB

Length

Max length: 5179
Median length: 1500
Mean length: 215.10915
Min length: 2

Characters and Unicode

Total characters: 9644849
Distinct characters: 394
Distinct categories: 14
Distinct scripts: 9
Distinct blocks: 10

Unique

Unique: 41917
Unique (%): 93.5%

Sample

1st row: ['Tom Hanks', 'Tim Allen', 'Don Rickles', 'Jim Varney', 'Wallace Shawn', 'John Ratzenberger', 'Annie Potts', 'John Morris', 'Erik von Detten', 'Laurie Metcalf', 'R. Lee Ermey', 'Sarah Freeman', 'Penn Jillette']
2nd row: ['Robin Williams', 'Jonathan Hyde', 'Kirsten Dunst', 'Bradley Pierce', 'Bonnie Hunt', 'Bebe Neuwirth', 'David Alan Grier', 'Patricia Clarkson', 'Adam Hann-Byrd', 'Laura Bell Bundy', 'James Handy', 'Gillian Barber', 'Brandon Obray', 'Cyrus Thiedeke', 'Gary Joseph Thorup', 'Leonard Zola', 'Lloyd Berry', 'Malcolm Stewart', 'Annabel Kershaw', 'Darryl Henriques', 'Robyn Driscoll', 'Peter Bryant', 'Sarah Gilson', 'Florica Vlad', 'June Lion', 'Brenda Lockmuller']
3rd row: ['Walter Matthau', 'Jack Lemmon', 'Ann-Margret', 'Sophia Loren', 'Daryl Hannah', 'Burgess Meredith', 'Kevin Pollak']
4th row: ['Whitney Houston', 'Angela Bassett', 'Loretta Devine', 'Lela Rochon', 'Gregory Hines', 'Dennis Haysbert', 'Michael Beach', 'Mykelti Williamson', 'Lamont Johnson', 'Wesley Snipes']
5th row: ['Steve Martin', 'Diane Keaton', 'Martin Short', 'Kimberly Williams-Paisley', 'George Newbern', 'Kieran Culkin', 'BD Wong', 'Peter Michael Goetz', 'Kate McGregor-Stewart', 'Jane Adams', 'Eugene Levy', 'Lori Alan']
ValueCountFrequency (%)
john 9721
 
0.8%
michael 7413
 
0.6%
david 6130
 
0.5%
robert 5689
 
0.5%
james 5637
 
0.5%
richard 4407
 
0.4%
paul 4296
 
0.4%
peter 3866
 
0.3%
william 3402
 
0.3%
george 3399
 
0.3%
Other values (112094) 1102819
95.3%

Most occurring characters

ValueCountFrequency (%)
1112000
 
11.5%
' 1109776
 
11.5%
a 697990
 
7.2%
e 659483
 
6.8%
n 519392
 
5.4%
, 514860
 
5.3%
r 492893
 
5.1%
i 479189
 
5.0%
o 419776
 
4.4%
l 363213
 
3.8%
Other values (384) 3276277
34.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5597512
58.0%
Other Punctuation 1651149
 
17.1%
Uppercase Letter 1179686
 
12.2%
Space Separator 1112000
 
11.5%
Open Punctuation 44859
 
0.5%
Close Punctuation 44845
 
0.5%
Dash Punctuation 14012
 
0.1%
Other Letter 543
 
< 0.1%
Decimal Number 115
 
< 0.1%
Final Punctuation 83
 
< 0.1%
Other values (4) 45
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 697990
12.5%
e 659483
11.8%
n 519392
9.3%
r 492893
 
8.8%
i 479189
 
8.6%
o 419776
 
7.5%
l 363213
 
6.5%
s 253368
 
4.5%
t 250955
 
4.5%
h 196075
 
3.5%
Other values (138) 1265178
22.6%
Other Letter
ValueCountFrequency (%)
ا 32
 
5.9%
م 31
 
5.7%
ع 19
 
3.5%
ی 19
 
3.5%
ن 18
 
3.3%
د 17
 
3.1%
ر 17
 
3.1%
17
 
3.1%
ي 16
 
2.9%
12
 
2.2%
Other values (104) 345
63.5%
Uppercase Letter
ValueCountFrequency (%)
M 108425
 
9.2%
S 91444
 
7.8%
C 83341
 
7.1%
J 82610
 
7.0%
B 81638
 
6.9%
A 70102
 
5.9%
R 66817
 
5.7%
D 65305
 
5.5%
L 60670
 
5.1%
G 54164
 
4.6%
Other values (81) 415170
35.2%
Other Punctuation
ValueCountFrequency (%)
' 1109776
67.2%
, 514860
31.2%
. 15905
 
1.0%
" 10549
 
0.6%
\ 32
 
< 0.1%
· 9
 
< 0.1%
& 6
 
< 0.1%
: 6
 
< 0.1%
! 5
 
< 0.1%
/ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 44
38.3%
5 37
32.2%
2 14
 
12.2%
1 8
 
7.0%
9 4
 
3.5%
3 2
 
1.7%
4 2
 
1.7%
7 2
 
1.7%
8 1
 
0.9%
6 1
 
0.9%
Nonspacing Mark
ValueCountFrequency (%)
́ 10
58.8%
2
 
11.8%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
1
 
5.9%
Open Punctuation
ValueCountFrequency (%)
[ 44837
> 99.9%
14
 
< 0.1%
( 8
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
74
89.2%
6
 
7.2%
» 3
 
3.6%
Close Punctuation
ValueCountFrequency (%)
] 44837
> 99.9%
) 8
 
< 0.1%
Initial Punctuation
ValueCountFrequency (%)
20
87.0%
« 3
 
13.0%
Space Separator
ValueCountFrequency (%)
1112000
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 14012
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 3
100.0%
Modifier Symbol
ValueCountFrequency (%)
´ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6774205
70.2%
Common 2867091
29.7%
Cyrillic 2979
 
< 0.1%
Han 276
 
< 0.1%
Arabic 241
 
< 0.1%
Thai 27
 
< 0.1%
Greek 14
 
< 0.1%
Inherited 10
 
< 0.1%
Hangul 6
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 697990
 
10.3%
e 659483
 
9.7%
n 519392
 
7.7%
r 492893
 
7.3%
i 479189
 
7.1%
o 419776
 
6.2%
l 363213
 
5.4%
s 253368
 
3.7%
t 250955
 
3.7%
h 196075
 
2.9%
Other values (163) 2441871
36.0%
Han
ValueCountFrequency (%)
17
 
6.2%
12
 
4.3%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
9
 
3.3%
9
 
3.3%
Other values (55) 163
59.1%
Cyrillic
ValueCountFrequency (%)
а 316
 
10.6%
и 303
 
10.2%
о 227
 
7.6%
н 224
 
7.5%
р 209
 
7.0%
е 169
 
5.7%
л 149
 
5.0%
к 132
 
4.4%
т 114
 
3.8%
с 106
 
3.6%
Other values (51) 1030
34.6%
Common
ValueCountFrequency (%)
1112000
38.8%
' 1109776
38.7%
, 514860
18.0%
[ 44837
 
1.6%
] 44837
 
1.6%
. 15905
 
0.6%
- 14012
 
0.5%
" 10549
 
0.4%
74
 
< 0.1%
0 44
 
< 0.1%
Other values (24) 197
 
< 0.1%
Arabic
ValueCountFrequency (%)
ا 32
13.3%
م 31
12.9%
ع 19
 
7.9%
ی 19
 
7.9%
ن 18
 
7.5%
د 17
 
7.1%
ر 17
 
7.1%
ي 16
 
6.6%
ل 9
 
3.7%
ب 8
 
3.3%
Other values (18) 55
22.8%
Thai
ValueCountFrequency (%)
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
1
 
3.7%
1
 
3.7%
1
 
3.7%
1
 
3.7%
Other values (11) 11
40.7%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Greek
ValueCountFrequency (%)
ν 6
42.9%
Ζ 2
 
14.3%
α 2
 
14.3%
ο 2
 
14.3%
ί 2
 
14.3%
Inherited
ValueCountFrequency (%)
́ 10
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9603360
99.6%
None 37780
 
0.4%
Cyrillic 2979
 
< 0.1%
CJK 276
 
< 0.1%
Arabic 241
 
< 0.1%
Punctuation 114
 
< 0.1%
Latin Ext Additional 56
 
< 0.1%
Thai 27
 
< 0.1%
Diacriticals 10
 
< 0.1%
Hangul 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1112000
 
11.6%
' 1109776
 
11.6%
a 697990
 
7.3%
e 659483
 
6.9%
n 519392
 
5.4%
, 514860
 
5.4%
r 492893
 
5.1%
i 479189
 
5.0%
o 419776
 
4.4%
l 363213
 
3.8%
Other values (68) 3234788
33.7%
None
ValueCountFrequency (%)
é 8994
23.8%
á 4101
 
10.9%
í 2720
 
7.2%
ô 2319
 
6.1%
ö 1997
 
5.3%
ó 1852
 
4.9%
ü 1475
 
3.9%
ć 1352
 
3.6%
è 1212
 
3.2%
ä 991
 
2.6%
Other values (110) 10767
28.5%
Cyrillic
ValueCountFrequency (%)
а 316
 
10.6%
и 303
 
10.2%
о 227
 
7.6%
н 224
 
7.5%
р 209
 
7.0%
е 169
 
5.7%
л 149
 
5.0%
к 132
 
4.4%
т 114
 
3.8%
с 106
 
3.6%
Other values (51) 1030
34.6%
Punctuation
ValueCountFrequency (%)
74
64.9%
20
 
17.5%
14
 
12.3%
6
 
5.3%
Arabic
ValueCountFrequency (%)
ا 32
13.3%
م 31
12.9%
ع 19
 
7.9%
ی 19
 
7.9%
ن 18
 
7.5%
د 17
 
7.1%
ر 17
 
7.1%
ي 16
 
6.6%
ل 9
 
3.7%
ب 8
 
3.3%
Other values (18) 55
22.8%
CJK
ValueCountFrequency (%)
17
 
6.2%
12
 
4.3%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
11
 
4.0%
9
 
3.3%
9
 
3.3%
Other values (55) 163
59.1%
Latin Ext Additional
ValueCountFrequency (%)
15
26.8%
9
16.1%
6
 
10.7%
6
 
10.7%
ế 5
 
8.9%
4
 
7.1%
4
 
7.1%
4
 
7.1%
2
 
3.6%
1
 
1.8%
Diacriticals
ValueCountFrequency (%)
́ 10
100.0%
Thai
ValueCountFrequency (%)
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
2
 
7.4%
1
 
3.7%
1
 
3.7%
1
 
3.7%
1
 
3.7%
Other values (11) 11
40.7%
Hangul
ValueCountFrequency (%)
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
1
16.7%
Distinct18622
Distinct (%)41.5%
Missing0
Missing (%)0.0%
Memory size350.4 KiB
2023-06-10T19:23:11.872386image/svg+xmlMatplotlib v3.7.1, https://matplotlib.org/

Length

Max length: 738
Median length: 530
Mean length: 18.886768
Min length: 2

Characters and Unicode

Total characters: 846826
Distinct characters: 206
Distinct categories: 10
Distinct scripts: 6
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
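
As a rough illustration of how such character properties can be read off a text value, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and the sample string is the first Sample row of this variable; this is not necessarily how the profiling tool computes its tables.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count characters in `text` by Unicode general category code.

    Codes are the two-letter abbreviations from the Unicode Standard,
    e.g. Ll = Lowercase Letter, Lu = Uppercase Letter,
    Po = Other Punctuation, Ps/Pe = Open/Close Punctuation,
    Zs = Space Separator.
    """
    return Counter(unicodedata.category(ch) for ch in text)


# First sample row of this variable, as shown in the report.
sample = "['John Lasseter']"
counts = category_counts(sample)

for code, n in counts.most_common():
    print(code, n)
```

Summing these counters over every row of a column would give a per-category tally comparable in spirit to the "Most occurring categories" table.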

Unique

Unique: 11971
Unique (%): 26.7%

Sample

1st row: ['John Lasseter']
2nd row: ['Joe Johnston']
3rd row: ['Howard Deutch']
4th row: ['Forest Whitaker']
5th row: ['Charles Shyer']
Value | Count | Frequency (%)
john | 1218 | 1.2%
michael | 940 | 0.9%
 | 918 | 0.9%
david | 892 | 0.9%
robert | 847 | 0.8%
peter | 573 | 0.6%
william | 553 | 0.5%
richard | 538 | 0.5%
james | 521 | 0.5%
paul | 466 | 0.5%
Other values (18517) | 95145 | 92.7%

Most occurring characters

Value | Count | Frequency (%)
' | 96597 | 11.4%
(space) | 57786 | 6.8%
e | 56757 | 6.7%
a | 56231 | 6.6%
] | 44837 | 5.3%
[ | 44837 | 5.3%
r | 44156 | 5.2%
n | 43746 | 5.2%
i | 42466 | 5.0%
o | 38282 | 4.5%
Other values (196) | 321131 | 37.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 489921 | 57.9%
Other Punctuation | 104492 | 12.3%
Uppercase Letter | 103542 | 12.2%
Space Separator | 57786 | 6.8%
Close Punctuation | 44839 | 5.3%
Open Punctuation | 44839 | 5.3%
Dash Punctuation | 1380 | 0.2%
Other Letter | 23 | < 0.1%
Decimal Number | 3 | < 0.1%
Math Symbol | 1 | < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 56757
11.6%
a 56231
11.5%
r 44156
 
9.0%
n 43746
 
8.9%
i 42466
 
8.7%
o 38282
 
7.8%
l 29818
 
6.1%
s 22586
 
4.6%
t 21495
 
4.4%
h 18109
 
3.7%
Other values (97) 116275
23.7%
Uppercase Letter
ValueCountFrequency (%)
M 9115
 
8.8%
S 8660
 
8.4%
J 7771
 
7.5%
R 6618
 
6.4%
C 6449
 
6.2%
B 6433
 
6.2%
A 6213
 
6.0%
D 5520
 
5.3%
L 5366
 
5.2%
G 4926
 
4.8%
Other values (53) 36471
35.2%
Other Letter
ValueCountFrequency (%)
ی 2
 
8.7%
م 2
 
8.7%
ا 2
 
8.7%
1
 
4.3%
1
 
4.3%
پ 1
 
4.3%
ن 1
 
4.3%
ع 1
 
4.3%
د 1
 
4.3%
1
 
4.3%
Other values (10) 10
43.5%
Other Punctuation
ValueCountFrequency (%)
' 96597
92.4%
, 4405
 
4.2%
. 3076
 
2.9%
" 402
 
0.4%
\ 11
 
< 0.1%
· 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 1
33.3%
5 1
33.3%
9 1
33.3%
Close Punctuation
ValueCountFrequency (%)
] 44837
> 99.9%
) 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
[ 44837
> 99.9%
( 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
57786
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1380
100.0%
Math Symbol
ValueCountFrequency (%)
| 1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 593275 | 70.1%
Common | 253340 | 29.9%
Cyrillic | 188 | < 0.1%
Arabic | 10 | < 0.1%
Han | 10 | < 0.1%
Hangul | 3 | < 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 56757
 
9.6%
a 56231
 
9.5%
r 44156
 
7.4%
n 43746
 
7.4%
i 42466
 
7.2%
o 38282
 
6.5%
l 29818
 
5.0%
s 22586
 
3.8%
t 21495
 
3.6%
h 18109
 
3.1%
Other values (123) 219629
37.0%
Cyrillic
ValueCountFrequency (%)
и 22
 
11.7%
о 15
 
8.0%
е 14
 
7.4%
а 14
 
7.4%
р 13
 
6.9%
к 13
 
6.9%
л 13
 
6.9%
н 11
 
5.9%
д 9
 
4.8%
в 6
 
3.2%
Other values (27) 58
30.9%
Common
ValueCountFrequency (%)
' 96597
38.1%
57786
22.8%
] 44837
17.7%
[ 44837
17.7%
, 4405
 
1.7%
. 3076
 
1.2%
- 1380
 
0.5%
" 402
 
0.2%
\ 11
 
< 0.1%
) 2
 
< 0.1%
Other values (6) 7
 
< 0.1%
Han
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Arabic
ValueCountFrequency (%)
ی 2
20.0%
م 2
20.0%
ا 2
20.0%
پ 1
10.0%
ن 1
10.0%
ع 1
10.0%
د 1
10.0%
Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 842513 | 99.5%
None | 4099 | 0.5%
Cyrillic | 188 | < 0.1%
Arabic | 10 | < 0.1%
CJK | 10 | < 0.1%
Latin Ext Additional | 3 | < 0.1%
Hangul | 3 | < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 96597
 
11.5%
57786
 
6.9%
e 56757
 
6.7%
a 56231
 
6.7%
] 44837
 
5.3%
[ 44837
 
5.3%
r 44156
 
5.2%
n 43746
 
5.2%
i 42466
 
5.0%
o 38282
 
4.5%
Other values (57) 316818
37.6%
None
ValueCountFrequency (%)
é 968
23.6%
á 406
 
9.9%
ö 268
 
6.5%
í 247
 
6.0%
ó 235
 
5.7%
ô 163
 
4.0%
ä 152
 
3.7%
è 128
 
3.1%
ü 116
 
2.8%
ç 110
 
2.7%
Other values (69) 1306
31.9%
Cyrillic
ValueCountFrequency (%)
и 22
 
11.7%
о 15
 
8.0%
е 14
 
7.4%
а 14
 
7.4%
р 13
 
6.9%
к 13
 
6.9%
л 13
 
6.9%
н 11
 
5.9%
д 9
 
4.8%
в 6
 
3.2%
Other values (27) 58
30.9%
Arabic
ValueCountFrequency (%)
ی 2
20.0%
م 2
20.0%
ا 2
20.0%
پ 1
10.0%
ن 1
10.0%
ع 1
10.0%
د 1
10.0%
Latin Ext Additional
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
CJK
ValueCountFrequency (%)
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
1
10.0%
Hangul
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%

release_month
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.4642371
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 7
Q3: 10
95-th percentile: 12
Maximum: 12
Range: 11
Interquartile range (IQR): 7

Descriptive statistics

Standard deviation: 3.627512
Coefficient of variation (CV): 0.5611663
Kurtosis: -1.3242628
Mean: 6.4642371
Median Absolute Deviation (MAD): 3
Skewness: -0.073011831
Sum: 289837
Variance: 13.158843
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
Value | Count | Frequency (%)
1 | 5816 | 13.0%
9 | 4786 | 10.7%
10 | 4559 | 10.2%
12 | 3754 | 8.4%
11 | 3622 | 8.1%
3 | 3512 | 7.8%
4 | 3411 | 7.6%
8 | 3359 | 7.5%
5 | 3313 | 7.4%
6 | 3105 | 6.9%
Other values (2) | 5600 | 12.5%

Value | Count | Frequency (%)
1 | 5816 | 13.0%
2 | 2996 | 6.7%
3 | 3512 | 7.8%
4 | 3411 | 7.6%
5 | 3313 | 7.4%
6 | 3105 | 6.9%
7 | 2604 | 5.8%
8 | 3359 | 7.5%
9 | 4786 | 10.7%
10 | 4559 | 10.2%

Value | Count | Frequency (%)
12 | 3754 | 8.4%
11 | 3622 | 8.1%
10 | 4559 | 10.2%
9 | 4786 | 10.7%
8 | 3359 | 7.5%
7 | 2604 | 5.8%
6 | 3105 | 6.9%
5 | 3313 | 7.4%
4 | 3411 | 7.6%
3 | 3512 | 7.8%
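
The summary statistics reported for release_month can be cross-checked from the per-month counts listed in this section. The sketch below is only an illustration: the `counts` dictionary is transcribed from the frequency tables, and the variance uses the sample form (dividing by n - 1), which is an assumption that happens to reproduce the reported figures to the digits shown.

```python
from math import sqrt

# Per-month counts transcribed from the release_month frequency tables.
counts = {
    1: 5816, 2: 2996, 3: 3512, 4: 3411, 5: 3313, 6: 3105,
    7: 2604, 8: 3359, 9: 4786, 10: 4559, 11: 3622, 12: 3754,
}

n = sum(counts.values())                       # number of observations
total = sum(m * c for m, c in counts.items())  # "Sum" in the report
mean = total / n

# Sample variance (n - 1 in the denominator); an assumption, see above.
variance = sum(c * (m - mean) ** 2 for m, c in counts.items()) / (n - 1)
std = sqrt(variance)

# Coefficient of variation: standard deviation relative to the mean.
cv = std / mean

print(f"n={n} sum={total} mean={mean:.7f}")
print(f"variance={variance:.6f} std={std:.6f} cv={cv:.7f}")
```

Recomputing from grouped counts works here because release_month takes only twelve distinct integer values, so the frequency table carries the full distribution.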

release_year
Real number (ℝ)

Distinct: 135
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1991.8344
Minimum: 1874
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 350.4 KiB

Quantile statistics

Minimum: 1874
5-th percentile: 1941
Q1: 1978
Median: 2001
Q3: 2010
95-th percentile: 2015
Maximum: 2020
Range: 146
Interquartile range (IQR): 32

Descriptive statistics

Standard deviation: 24.005446
Coefficient of variation (CV): 0.012051929
Kurtosis: 0.79577462
Mean: 1991.8344
Median Absolute Deviation (MAD): 12
Skewness: -1.2142281
Sum: 89307879
Variance: 576.26143
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2014 | 1953 | 4.4%
2015 | 1889 | 4.2%
2013 | 1875 | 4.2%
2012 | 1710 | 3.8%
2011 | 1651 | 3.7%
2009 | 1574 | 3.5%
2016 | 1560 | 3.5%
2010 | 1477 | 3.3%
2008 | 1453 | 3.2%
2007 | 1305 | 2.9%
Other values (125) | 28390 | 63.3%

Value | Count | Frequency (%)
1874 | 1 | < 0.1%
1878 | 1 | < 0.1%
1883 | 1 | < 0.1%
1887 | 1 | < 0.1%
1888 | 2 | < 0.1%
1890 | 5 | < 0.1%
1891 | 6 | < 0.1%
1892 | 3 | < 0.1%
1893 | 1 | < 0.1%
1894 | 12 | < 0.1%

Value | Count | Frequency (%)
2020 | 1 | < 0.1%
2018 | 5 | < 0.1%
2017 | 451 | 1.0%
2016 | 1560 | 3.5%
2015 | 1889 | 4.2%
2014 | 1953 | 4.4%
2013 | 1875 | 4.2%
2012 | 1710 | 3.8%
2011 | 1651 | 3.7%
2010 | 1477 | 3.3%

Interactions

[Interaction plots: 81 image panels, not reproduced in this text export]

Correlations

 | id | budget | popularity | runtime | vote_average | vote_count | id_collection | release_month | release_year
id | 1.000 | -0.256 | -0.412 | -0.204 | -0.150 | -0.434 | 0.221 | -0.014 | 0.388
budget | -0.256 | 1.000 | 0.465 | 0.227 | 0.072 | 0.486 | -0.129 | 0.046 | 0.143
popularity | -0.412 | 0.465 | 1.000 | 0.306 | 0.241 | 0.894 | -0.154 | 0.072 | 0.186
runtime | -0.204 | 0.227 | 0.306 | 1.000 | 0.195 | 0.290 | -0.102 | 0.071 | 0.034
vote_average | -0.150 | 0.072 | 0.241 | 0.195 | 1.000 | 0.317 | -0.053 | 0.048 | -0.009
vote_count | -0.434 | 0.486 | 0.894 | 0.290 | 0.317 | 1.000 | -0.154 | 0.064 | 0.200
id_collection | 0.221 | -0.129 | -0.154 | -0.102 | -0.053 | -0.154 | 1.000 | -0.025 | 0.116
release_month | -0.014 | 0.046 | 0.072 | 0.071 | 0.048 | 0.064 | -0.025 | 1.000 | -0.015
release_year | 0.388 | 0.143 | 0.186 | 0.034 | -0.009 | 0.200 | 0.116 | -0.015 | 1.000
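
One way to read a matrix like this is to list the variable pairs whose absolute correlation exceeds some cutoff. The sketch below does that on the values transcribed from the table; the function name `strong_pairs` and the 0.5 threshold are illustrative assumptions, not the rule the report uses to raise its "High correlation" alerts.

```python
names = ["id", "budget", "popularity", "runtime", "vote_average",
         "vote_count", "id_collection", "release_month", "release_year"]

# Correlation values transcribed from the table, rows in the order of `names`.
matrix = [
    [1.000, -0.256, -0.412, -0.204, -0.150, -0.434, 0.221, -0.014, 0.388],
    [-0.256, 1.000, 0.465, 0.227, 0.072, 0.486, -0.129, 0.046, 0.143],
    [-0.412, 0.465, 1.000, 0.306, 0.241, 0.894, -0.154, 0.072, 0.186],
    [-0.204, 0.227, 0.306, 1.000, 0.195, 0.290, -0.102, 0.071, 0.034],
    [-0.150, 0.072, 0.241, 0.195, 1.000, 0.317, -0.053, 0.048, -0.009],
    [-0.434, 0.486, 0.894, 0.290, 0.317, 1.000, -0.154, 0.064, 0.200],
    [0.221, -0.129, -0.154, -0.102, -0.053, -0.154, 1.000, -0.025, 0.116],
    [-0.014, 0.046, 0.072, 0.071, 0.048, 0.064, -0.025, 1.000, -0.015],
    [0.388, 0.143, 0.186, 0.034, -0.009, 0.200, 0.116, -0.015, 1.000],
]


def strong_pairs(names, matrix, threshold=0.5):
    """Return (name_a, name_b, value) for each pair with |value| >= threshold.

    Only the upper triangle is scanned, so each pair appears once and the
    diagonal (always 1.000) is skipped.
    """
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs


print(strong_pairs(names, matrix))
```

With this cutoff the only pair returned is popularity and vote_count, the same pair flagged in the report's alerts.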

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
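
To make the idea of nullity correlation concrete, the sketch below computes a Pearson correlation between the missing-value indicators of two columns. The function name `nullity_correlation` and the toy columns are hypothetical, and missing values are represented as `None`; the profiling tool's own computation may differ in detail.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns.

    Each value becomes 1.0 if missing (None) and 0.0 otherwise. Returns None
    when either column has no variation in missingness (all present or all
    missing), because the correlation is undefined in that case.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)


# Toy columns (hypothetical data, not taken from the dataset).
x = [1, None, 3, None]
y = [None, 2, None, 4]   # missing exactly where x is present
z = [None, None, 5, 6]   # missingness unrelated to x
full = [1, 2, 3, 4]      # no missing values

print(nullity_correlation(x, x))     # identical missingness pattern
print(nullity_correlation(x, y))     # opposite missingness pattern
print(nullity_correlation(x, z))     # unrelated missingness pattern
print(nullity_correlation(x, full))  # undefined: full has no missing values
```

A value near 1 means two columns tend to be missing together, a value near -1 means one tends to be present when the other is missing, and columns with no missing values at all cannot be scored.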

Sample

index | id | title | budget | original_language | popularity | runtime | vote_average | vote_count | id_collection | name_genres | Id_production_companies | Name_cast | Director | release_month | release_year
0 | 862 | Toy Story | 30000000.0 | en | 21.946943 | 81.0 | 7.7 | 5415.0 | 10194.0 | ['Animation', 'Comedy', 'Family'] | [3] | ['Tom Hanks', 'Tim Allen', 'Don Rickles', 'Jim Varney', 'Wallace Shawn', 'John Ratzenberger', 'Annie Potts', 'John Morris', 'Erik von Detten', 'Laurie Metcalf', 'R. Lee Ermey', 'Sarah Freeman', 'Penn Jillette'] | ['John Lasseter'] | 10 | 1995
1 | 8844 | Jumanji | 65000000.0 | en | 17.015539 | 104.0 | 6.9 | 2413.0 | NaN | ['Adventure', 'Fantasy', 'Family'] | [559, 2550, 10201] | ['Robin Williams', 'Jonathan Hyde', 'Kirsten Dunst', 'Bradley Pierce', 'Bonnie Hunt', 'Bebe Neuwirth', 'David Alan Grier', 'Patricia Clarkson', 'Adam Hann-Byrd', 'Laura Bell Bundy', 'James Handy', 'Gillian Barber', 'Brandon Obray', 'Cyrus Thiedeke', 'Gary Joseph Thorup', 'Leonard Zola', 'Lloyd Berry', 'Malcolm Stewart', 'Annabel Kershaw', 'Darryl Henriques', 'Robyn Driscoll', 'Peter Bryant', 'Sarah Gilson', 'Florica Vlad', 'June Lion', 'Brenda Lockmuller'] | ['Joe Johnston'] | 12 | 1995
2 | 15602 | Grumpier Old Men | 0.0 | en | 11.712900 | 101.0 | 6.5 | 92.0 | 119050.0 | ['Romance', 'Comedy'] | [6194, 19464] | ['Walter Matthau', 'Jack Lemmon', 'Ann-Margret', 'Sophia Loren', 'Daryl Hannah', 'Burgess Meredith', 'Kevin Pollak'] | ['Howard Deutch'] | 12 | 1995
3 | 31357 | Waiting to Exhale | 16000000.0 | en | 3.859495 | 127.0 | 6.1 | 34.0 | NaN | ['Comedy', 'Drama', 'Romance'] | [306] | ['Whitney Houston', 'Angela Bassett', 'Loretta Devine', 'Lela Rochon', 'Gregory Hines', 'Dennis Haysbert', 'Michael Beach', 'Mykelti Williamson', 'Lamont Johnson', 'Wesley Snipes'] | ['Forest Whitaker'] | 12 | 1995
4 | 11862 | Father of the Bride Part II | 0.0 | en | 8.387519 | 106.0 | 5.7 | 173.0 | 96871.0 | ['Comedy'] | [5842, 9195] | ['Steve Martin', 'Diane Keaton', 'Martin Short', 'Kimberly Williams-Paisley', 'George Newbern', 'Kieran Culkin', 'BD Wong', 'Peter Michael Goetz', 'Kate McGregor-Stewart', 'Jane Adams', 'Eugene Levy', 'Lori Alan'] | ['Charles Shyer'] | 2 | 1995
5 | 949 | Heat | 60000000.0 | en | 17.924927 | 170.0 | 7.7 | 1886.0 | NaN | ['Action', 'Crime', 'Drama', 'Thriller'] | [508, 675, 6194] | ['Al Pacino', 'Robert De Niro', 'Val Kilmer', 'Jon Voight', 'Tom Sizemore', 'Diane Venora', 'Amy Brenneman', 'Ashley Judd', 'Mykelti Williamson', 'Natalie Portman', 'Ted Levine', 'Tom Noonan', 'Tone Loc', 'Hank Azaria', 'Wes Studi', 'Dennis Haysbert', 'Danny Trejo', 'Henry Rollins', 'William Fichtner', 'Kevin Gage', 'Susan Traylor', 'Jerry Trimble', 'Ricky Harris', 'Jeremy Piven', 'Xander Berkeley', 'Begonya Plaza', 'Rick Avery', 'Hazelle Goodman', 'Ray Buktenica', 'Max Daniels', 'Vince Deadrick Jr.', 'Steven Ford', 'Farrah Forke', 'Patricia Healy', 'Paul Herman', 'Cindy Katz', 'Brian Libby', 'Dan Martin', 'Mario Roberts', 'Thomas Rosales, Jr.', 'Yvonne Zima', 'Mick Gould', 'Bud Cort', 'Viviane Vives', 'Kim Staunton', 'Martin Ferrero', 'Brad Baldridge', 'Andrew Camuccio', 'Kenny Endoso', 'Kimberly Flynn', 'Niki Harris', 'Bill McIntosh', 'Rick Marzan', 'Terry Miller', "Daniel O'Haco", 'Kai Soremekun', 'Peter Blackwell', 'Trevor Coppola', 'Mary Kircher', 'Darin Mangan', 'Robert Miranda', 'Manny Perry', 'Iva Franks Singer', 'Tim Werner', 'Philip Ettington'] | ['Michael Mann'] | 12 | 1995
6 | 11860 | Sabrina | 58000000.0 | en | 6.677277 | 127.0 | 6.2 | 141.0 | NaN | ['Comedy', 'Romance'] | [4, 258, 932, 5842, 14941, 55873, 58079] | ['Harrison Ford', 'Julia Ormond', 'Greg Kinnear', 'Angie Dickinson', 'Nancy Marchand', 'John Wood', 'Richard Crenna', 'Lauren Holly', 'Dana Ivey', 'Fanny Ardant', 'Patrick Bruel', 'Paul Giamatti', 'Miriam Colón', 'Elizabeth Franz', 'Valérie Lemercier', 'Becky Ann Baker', 'John C. Vennema', 'Margo Martindale', 'J. Smith-Cameron', 'Christine Luneau-Lipton', 'Michael Dees', 'Denis Holmes', 'Jo-Jo Lowe', 'Ira Wheeler', 'Philippa Cooper', 'Ayako Kawahara', 'François Genty', 'Guillaume Gallienne', 'Inés Sastre', 'Phina Oruche', 'Andrea Behalikova', 'Jennifer Herrera', 'Kristina Kumlin', 'Eva Linderholm', 'Carmen Chaplin', 'Micheline Van de Velde', 'Joanna Rhodes', 'Alan Boone', 'Patrick Forster-Delmas', 'Kentaro Matsuo', 'Peter McKernan', 'Ed Connelly', 'Ronald L. Schwary', 'Alvin Lum', 'Siching Song', 'Phil Nee', 'Randy Becker', 'Susan Browning', 'Anthony Mondal', 'Peter Parks', 'Woodrow Asai', 'Eric Bruno Borgman', 'Michael Cline', 'Christopher Del Gaudio', 'Philippe Hartmann', 'Jerry Quinn', 'Dori Rosenthal'] | ['Sydney Pollack'] | 12 | 1995
7 | 45325 | Tom and Huck | 0.0 | en | 2.561161 | 97.0 | 5.4 | 45.0 | NaN | ['Action', 'Adventure', 'Drama', 'Family'] | [2] | ['Jonathan Taylor Thomas', 'Brad Renfro', 'Rachael Leigh Cook', 'Michael McShane', 'Amy Wright', 'Eric Schweig', 'Tamara Mello'] | ['Peter Hewitt'] | 12 | 1995
8 | 9091 | Sudden Death | 35000000.0 | en | 5.231580 | 106.0 | 5.5 | 174.0 | NaN | ['Action', 'Adventure', 'Thriller'] | [33, 21437, 23770] | ['Jean-Claude Van Damme', 'Powers Boothe', 'Dorian Harewood', 'Raymond J. Barry', 'Ross Malinger', 'Whittni Wright'] | ['Peter Hyams'] | 12 | 1995
9 | 710 | GoldenEye | 58000000.0 | en | 14.686036 | 130.0 | 6.6 | 1194.0 | 645.0 | ['Adventure', 'Action', 'Thriller'] | [60, 7576] | ['Pierce Brosnan', 'Sean Bean', 'Izabella Scorupco', 'Famke Janssen', 'Joe Don Baker', 'Judi Dench', 'Gottfried John', 'Robbie Coltrane', 'Alan Cumming', 'Tchéky Karyo', 'Desmond Llewelyn', 'Samantha Bond', 'Michael Kitchen', 'Serena Gordon', 'Simon Kunz', 'Billy J. Mitchell', 'Constantine Gregory', 'Minnie Driver', 'Michelle Arthur', 'Ravil Isyanov'] | ['Martin Campbell'] | 11 | 1995

index | id | title | budget | original_language | popularity | runtime | vote_average | vote_count | id_collection | name_genres | Id_production_companies | Name_cast | Director | release_month | release_year
44827 | 459928 | 12 Feet Deep | 0.0 | en | 4.479536 | 85.0 | 5.1 | 62.0 | NaN | ['Animation', 'Music'] | [] | ['Konsta Hietanen', 'Risto Tuorila', 'Jarmo Mäkinen', 'Antti Virmavirta', 'Kristiina Halttu', 'Rauno Ahonen'] | ['Matt Eskandari'] | 1 | 2016
44828 | 258514 | Van Gogh: Painted with Words | 0.0 | en | 2.071003 | 80.0 | 6.7 | 11.0 | NaN | ['Horror', 'Fantasy'] | [5062] | ['Kathy Bates', 'Victor Garber', 'Alan Cumming', 'Audra McDonald', 'Kristin Chenoweth', 'Erin Adams', 'Sarah Hyland', 'Lalaine', 'Nanea Miyata', 'Marissa Rago', 'Danelle Wilson', 'Andrea McArdle', 'Alicia Morton', 'Dennis Howard', 'Douglas Fisher', 'Kurt Knudson', 'Brooks Almy', 'Ruth Gottschall', 'Tom Billett', 'Frank Cavestani', 'Ellen Gerstein', 'David Pevsner', 'Ed Francis Martin', 'Bob Morrisey'] | ['Andrew Hutton'] | 4 | 2010
44829 | 382455 | Don't Call Me Son | 0.0 | pt | 0.552199 | 82.0 | 6.7 | 13.0 | NaN | ['Action', 'Crime', 'History'] | [670, 8797, 10470, 10471] | ['Nicolas Cage', 'Gina Gershon', 'Nicky Whelan', 'Faye Dunaway', 'Natalie Nelson', 'James Van Patten', 'Jonathan Baker', 'Leah Huebner', 'Ele Bardha', 'Corrie Danieley'] | ['Anna Muylaert'] | 7 | 2016
44830 | 217917 | The Wrong Road | 0.0 | en | 0.316432 | 62.0 | 5.0 | 1.0 | NaN | ['Documentary'] | [73685, 89911] | ['Pirkka-Pekka Petelius', 'Paavo Kerosuo', 'Pekka Strang', 'Johanna af Schultén', 'Cecilia Paul', 'Emil Lundberg', 'Peter Franzén', 'Alexander Skarsgård'] | ['James Cruze'] | 10 | 1937
44831 | 42616 | The Virginian | 0.0 | en | 0.037264 | 91.0 | 8.0 | 1.0 | NaN | ['Adventure', 'Drama', 'Family'] | [] | ['Christopher Plummer', 'Tom Bosley', 'Bob Elliott', 'Ray Goulding', 'Frank Gorshin', 'Tony Randall'] | ['Victor Fleming'] | 11 | 1929
44832 | 74384 | San Giovanni decollato | 0.0 | it | 0.400952 | 0.0 | 5.7 | 3.0 | NaN | ['Music', 'Family', 'Comedy'] | [] | ['Patrick Huard', 'Colm Feore', 'Erik Knudsen', 'Noam Jenkins', 'Sarah-Jeanne Labrosse', 'Lucie Laurier', 'Andre Bedard'] | ['Amleto Palermi', 'Giorgio Bianchi'] | 12 | 1940
44833 | 64043 | I due orfanelli | 0.0 | it | 0.199214 | 0.0 | 3.5 | 4.0 | NaN | ['Thriller', 'Drama'] | [7177, 68273] | ['Gloria Blondell'] | ['Mario Mattoli'] | 1 | 1947
44834 | 70207 | The Crooked E: The Unshredded Truth About Enron | 0.0 | en | 0.085047 | 100.0 | 2.5 | 2.0 | NaN | [] | [3166] | ['Antonio Banderas', 'Ben Kingsley', 'Liam McIntyre', 'Chad Lindberg', 'Gabriella Wright', 'Cung Le', 'Mark Smith', 'Bashar Rahal', 'Yana Marinova', 'Jiro Wang', 'Ivailo Dimitrov', 'Velimir Velev', 'Mark Basnight', 'Lillian Blankenship', 'Katherine de la Rocha', 'Shari Watson'] | ['Penelope Spheeris'] | 1 | 2003
44835 | 29458 | One Hundred Steps | 0.0 | it | 4.675250 | 114.0 | 7.8 | 116.0 | 104774.0 | ['Animation', 'Family'] | [25473, 74795] | ['Steve John Shepherd', 'Ben Waters', 'Alec Newman', 'Chiwetel Ejiofor', 'Anjela Lauren Smith', 'Melanie Gutteridge', 'Georgia Mackenzie', 'Alicya Eyo', 'Freddie Annobil-Dodoo', 'Alun Armstrong', 'John Blundell', 'Karl Collins', 'Robbie Gee'] | ['Marco Tullio Giordana'] | 8 | 2000
44836 | 160788 | Color of the Ocean | 0.0 | de | 0.119252 | 95.0 | 3.0 | 2.0 | 417491.0 | ['Crime', 'Comedy', 'Action'] | [8833] | [] | ['Maggie Peren'] | 3 | 2012